{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Copyright (c) Microsoft Corporation. All rights reserved.*\n",
    "\n",
    "*Licensed under the MIT License.*\n",
    "\n",
    "# Text Classification of MultiNLI Sentences using BERT"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Before You Start\n",
    "\n",
    "> **Tip**: If you want to run through the notebook quickly, you can set the **`QUICK_RUN`** flag in the cell below to **`True`** to run the notebook on a small subset of the data and a smaller number of epochs. \n",
    "\n",
    "The table below provides some reference running time on different machine configurations.  \n",
    "\n",
    "|QUICK_RUN|Machine Configurations|Running time|\n",
    "|:---------|:----------------------|:------------|\n",
    "|True|4 **CPU**s, 14GB memory| ~ 15 minutes|\n",
    "|False|4 **CPU**s, 14GB memory| ~19.5 hours|\n",
    "|True|1 NVIDIA Tesla K80 GPUs, 12GB GPU memory| ~ 3 minutes |\n",
    "|False|1 NVIDIA Tesla K80 GPUs, 12GB GPU memory| ~ 1.5 hours|\n",
    "\n",
    "If you run into CUDA out-of-memory error or the jupyter kernel dies constantly, try reducing the `BATCH_SIZE` and `MAX_LEN`, but note that model performance will be compromised. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Set QUICK_RUN = True to run the notebook on a small subset of data and a smaller number of epochs.\n",
    "QUICK_RUN = False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.append(\"../../\")\n",
    "import os\n",
    "import json\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import scrapbook as sb\n",
    "from sklearn.metrics import classification_report, accuracy_score\n",
    "from sklearn.preprocessing import LabelEncoder\n",
    "from sklearn.model_selection import train_test_split\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "\n",
    "from utils_nlp.dataset.multinli import load_pandas_df\n",
    "from utils_nlp.eval.classification import eval_classification\n",
    "from utils_nlp.models.bert.sequence_classification import BERTSequenceClassifier\n",
    "from utils_nlp.models.bert.common import Language, Tokenizer\n",
    "from utils_nlp.common.timer import Timer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "In this notebook, we fine-tune and evaluate a pretrained [BERT](https://arxiv.org/abs/1810.04805) model on a subset of the [MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) dataset.\n",
    "\n",
    "We use a [sequence classifier](../../utils_nlp/models/bert/sequence_classification.py) that wraps [Hugging Face's PyTorch implementation](https://github.com/huggingface/pytorch-pretrained-BERT) of Google's [BERT](https://github.com/google-research/bert)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "tags": [
     "parameters"
    ]
   },
   "outputs": [],
   "source": [
    "TRAIN_DATA_FRACTION = 1\n",
    "TEST_DATA_FRACTION = 1\n",
    "NUM_EPOCHS = 1\n",
    "\n",
    "if QUICK_RUN:\n",
    "    TRAIN_DATA_FRACTION = 0.01\n",
    "    TEST_DATA_FRACTION = 0.01\n",
    "    NUM_EPOCHS = 1\n",
    "\n",
    "if torch.cuda.is_available():\n",
    "    BATCH_SIZE = 32\n",
    "else:\n",
    "    BATCH_SIZE = 8\n",
    "\n",
    "DATA_FOLDER = \"./temp\"\n",
    "BERT_CACHE_DIR = \"./temp\"\n",
    "LANGUAGE = Language.ENGLISH\n",
    "TO_LOWER = True\n",
    "MAX_LEN = 150\n",
    "BATCH_SIZE_PRED = 512\n",
    "TRAIN_SIZE = 0.6\n",
    "LABEL_COL = \"genre\"\n",
    "TEXT_COL = \"sentence1\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Read Dataset\n",
    "We start by loading a subset of the data. The following function also downloads and extracts the files, if they don't exist in the data folder.\n",
    "\n",
    "The MultiNLI dataset is mainly used for natural language inference (NLI) tasks, where the inputs are sentence pairs and the labels are entailment indicators. The sentence pairs are also classified into *genres* that allow for more coverage and better evaluation of NLI models.\n",
    "\n",
    "For our classification task, we use the first sentence only as the text input, and the corresponding genre as the label. We select the examples corresponding to one of the entailment labels (*neutral* in this case) to avoid duplicate rows, as the sentences are not unique, whereas the sentence pairs are."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = load_pandas_df(DATA_FOLDER, \"train\")\n",
    "df = df[df[\"gold_label\"]==\"neutral\"]  # get unique sentences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>genre</th>\n",
       "      <th>sentence1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>government</td>\n",
       "      <td>Conceptually cream skimming has two basic dime...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>telephone</td>\n",
       "      <td>yeah i tell you what though if you go price so...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>travel</td>\n",
       "      <td>But a few Christian mosaics survive above the ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>slate</td>\n",
       "      <td>It's not that the questions they asked weren't...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>travel</td>\n",
       "      <td>Thebes held onto power until the 12th Dynasty,...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         genre                                          sentence1\n",
       "0   government  Conceptually cream skimming has two basic dime...\n",
       "4    telephone  yeah i tell you what though if you go price so...\n",
       "6       travel  But a few Christian mosaics survive above the ...\n",
       "12       slate  It's not that the questions they asked weren't...\n",
       "13      travel  Thebes held onto power until the 12th Dynasty,..."
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[[LABEL_COL, TEXT_COL]].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The examples in the dataset are grouped into 5 genres:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "telephone     27783\n",
       "government    25784\n",
       "travel        25783\n",
       "fiction       25782\n",
       "slate         25768\n",
       "Name: genre, dtype: int64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[LABEL_COL].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We split the data for training and testing, and encode the class labels:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/data/anaconda/envs/nlp_gpu/lib/python3.6/site-packages/sklearn/model_selection/_split.py:2179: FutureWarning: From version 0.21, test_size will always complement train_size unless both are specified.\n",
      "  FutureWarning)\n"
     ]
    }
   ],
   "source": [
    "# split\n",
    "df_train, df_test = train_test_split(df, train_size = TRAIN_SIZE, random_state=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_train = df_train.sample(frac=TRAIN_DATA_FRACTION).reset_index(drop=True)\n",
    "df_test = df_test.sample(frac=TEST_DATA_FRACTION).reset_index(drop=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "# encode labels\n",
    "label_encoder = LabelEncoder()\n",
    "labels_train = label_encoder.fit_transform(df_train[LABEL_COL])\n",
    "labels_test = label_encoder.transform(df_test[LABEL_COL])\n",
    "\n",
    "num_labels = len(np.unique(labels_train))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of unique labels: 5\n",
      "Number of training examples: 78540\n",
      "Number of testing examples: 52360\n"
     ]
    }
   ],
   "source": [
    "print(\"Number of unique labels: {}\".format(num_labels))\n",
    "print(\"Number of training examples: {}\".format(df_train.shape[0]))\n",
    "print(\"Number of testing examples: {}\".format(df_test.shape[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tokenize and Preprocess"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before training, we tokenize the text documents and convert them to lists of tokens. The following steps instantiate a BERT tokenizer given the language, and tokenize the text of the training and testing sets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 78540/78540 [00:27<00:00, 2841.38it/s]\n",
      "100%|██████████| 52360/52360 [00:18<00:00, 2834.92it/s]\n"
     ]
    }
   ],
   "source": [
    "tokenizer = Tokenizer(LANGUAGE, to_lower=TO_LOWER, cache_dir=BERT_CACHE_DIR)\n",
    "\n",
    "tokens_train = tokenizer.tokenize(list(df_train[TEXT_COL]))\n",
    "tokens_test = tokenizer.tokenize(list(df_test[TEXT_COL]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In addition, we perform the following preprocessing steps in the cell below:\n",
    "- Convert the tokens into token indices corresponding to the BERT tokenizer's vocabulary\n",
    "- Add the special tokens [CLS] and [SEP] to mark the beginning and end of a sentence\n",
    "- Pad or truncate the token lists to the specified max length\n",
    "- Return mask lists that indicate paddings' positions\n",
    "- Return token type id lists that indicate which sentence the tokens belong to (not needed for one-sequence classification)\n",
    "\n",
    "*See the original [implementation](https://github.com/google-research/bert/blob/master/run_classifier.py) for more information on BERT's input format.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "tokens_train, mask_train, _ = tokenizer.preprocess_classification_tokens(\n",
    "    tokens_train, MAX_LEN\n",
    ")\n",
    "tokens_test, mask_test, _ = tokenizer.preprocess_classification_tokens(\n",
    "    tokens_test, MAX_LEN\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Model\n",
    "Next, we create a sequence classifier that loads a pre-trained BERT model, given the language and number of labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "classifier = BERTSequenceClassifier(\n",
    "    language=LANGUAGE, num_labels=num_labels, cache_dir=BERT_CACHE_DIR\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train\n",
    "We train the classifier using the training examples. This involves fine-tuning the BERT Transformer and learning a linear classification layer on top of that:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "t_total value of -1 results in schedule not being applied\n",
      "Iteration:   0%|          | 0/2455 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning: Only 1 CUDA device is available. Data parallelism is not possible.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "Iteration:   0%|          | 1/2455 [00:01<1:21:44,  2.00s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:1->246/2455; average training loss:1.653734\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  10%|█         | 247/2455 [07:39<1:09:04,  1.88s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:247->492/2455; average training loss:0.376494\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  20%|██        | 493/2455 [15:23<1:01:48,  1.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:493->738/2455; average training loss:0.314981\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  30%|███       | 739/2455 [23:06<53:42,  1.88s/it]  "
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:739->984/2455; average training loss:0.286209\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  40%|████      | 985/2455 [30:50<46:17,  1.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:985->1230/2455; average training loss:0.265873\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  50%|█████     | 1231/2455 [38:33<38:29,  1.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:1231->1476/2455; average training loss:0.252521\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  60%|██████    | 1477/2455 [46:16<30:38,  1.88s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:1477->1722/2455; average training loss:0.243316\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  70%|███████   | 1723/2455 [54:00<23:04,  1.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:1723->1968/2455; average training loss:0.235114\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  80%|████████  | 1969/2455 [1:01:44<15:14,  1.88s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:1969->2214/2455; average training loss:0.229056\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration:  90%|█████████ | 2215/2455 [1:09:26<07:30,  1.88s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch:1/1; batch:2215->2455/2455; average training loss:0.223192\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration: 100%|██████████| 2455/2455 [1:16:56<00:00,  1.57s/it]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Training time: 1.283 hrs]\n"
     ]
    }
   ],
   "source": [
    "with Timer() as t:\n",
    "    classifier.fit(\n",
    "        token_ids=tokens_train,\n",
    "        input_mask=mask_train,\n",
    "        labels=labels_train,    \n",
    "        num_epochs=NUM_EPOCHS,\n",
    "        batch_size=BATCH_SIZE,    \n",
    "        verbose=True,\n",
    "    )    \n",
    "print(\"[Training time: {:.3f} hrs]\".format(t.interval / 3600))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Score\n",
    "We score the test set using the trained classifier:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning: Only 1 CUDA device is available. Data parallelism is not possible.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Iteration: 100%|██████████| 103/103 [18:00<00:00,  8.24s/it]\n"
     ]
    }
   ],
   "source": [
    "preds = classifier.predict(token_ids=tokens_test, \n",
    "                           input_mask=mask_test, \n",
    "                           batch_size=BATCH_SIZE_PRED)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluate Results\n",
    "Finally, we compute the accuracy, precision, recall, and F1 metrics of the evaluation on the test set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy: 0.9421504965622612\n",
      "{\n",
      "    \"fiction\": {\n",
      "        \"f1-score\": 0.924482109227872,\n",
      "        \"precision\": 0.8953944368445053,\n",
      "        \"recall\": 0.9555231143552312,\n",
      "        \"support\": 10275\n",
      "    },\n",
      "    \"government\": {\n",
      "        \"f1-score\": 0.948873653281097,\n",
      "        \"precision\": 0.9565560821484992,\n",
      "        \"recall\": 0.9413136416634279,\n",
      "        \"support\": 10292\n",
      "    },\n",
      "    \"macro avg\": {\n",
      "        \"f1-score\": 0.9408187527049234,\n",
      "        \"precision\": 0.9413336757882582,\n",
      "        \"recall\": 0.9411302847360989,\n",
      "        \"support\": 52360\n",
      "    },\n",
      "    \"micro avg\": {\n",
      "        \"f1-score\": 0.9421504965622612,\n",
      "        \"precision\": 0.9421504965622612,\n",
      "        \"recall\": 0.9421504965622612,\n",
      "        \"support\": 52360\n",
      "    },\n",
      "    \"slate\": {\n",
      "        \"f1-score\": 0.8725352112676057,\n",
      "        \"precision\": 0.9031552639800062,\n",
      "        \"recall\": 0.8439233239272161,\n",
      "        \"support\": 10277\n",
      "    },\n",
      "    \"telephone\": {\n",
      "        \"f1-score\": 0.9935128410201723,\n",
      "        \"precision\": 0.9892929829218653,\n",
      "        \"recall\": 0.99776885319054,\n",
      "        \"support\": 11205\n",
      "    },\n",
      "    \"travel\": {\n",
      "        \"f1-score\": 0.9646899487278707,\n",
      "        \"precision\": 0.9622696130464151,\n",
      "        \"recall\": 0.9671224905440792,\n",
      "        \"support\": 10311\n",
      "    },\n",
      "    \"weighted avg\": {\n",
      "        \"f1-score\": 0.9417711062461178,\n",
      "        \"precision\": 0.942203390713011,\n",
      "        \"recall\": 0.9421504965622612,\n",
      "        \"support\": 52360\n",
      "    }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "report = classification_report(labels_test, preds, target_names=label_encoder.classes_, output_dict=True) \n",
    "accuracy = accuracy_score(labels_test, preds )\n",
    "print(\"accuracy: {}\".format(accuracy))\n",
    "print(json.dumps(report, indent=4, sort_keys=True))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.9421504965622612,
       "encoder": "json",
       "name": "accuracy",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "accuracy"
      }
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.9413336757882582,
       "encoder": "json",
       "name": "precision",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "precision"
      }
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.9411302847360989,
       "encoder": "json",
       "name": "recall",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "recall"
      }
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.9408187527049234,
       "encoder": "json",
       "name": "f1",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "f1"
      }
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# for testing\n",
    "sb.glue(\"accuracy\", accuracy)\n",
    "sb.glue(\"precision\", report[\"macro avg\"][\"precision\"])\n",
    "sb.glue(\"recall\", report[\"macro avg\"][\"recall\"])\n",
    "sb.glue(\"f1\", report[\"macro avg\"][\"f1-score\"])\n"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Tags",
  "kernelspec": {
   "display_name": "nlp_gpu",
   "language": "python",
   "name": "nlp_gpu"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
